The Present Application

By way of brief review, the present application is directed toward methods and systems 
for interactive language instruction. Significantly, the systems and methods provide for flexible 
lesson production in that they support repurposing other materials, such as news stories,
web content and specialized documents (page 9, line 29 - page 10, line 2). Other features and
functions that support interactive learning include the display of visual aids, such as an animated 
image of a human head and face. For example, the face and head are portrayed in a
3-dimensional perspective which is adjustable. That is, the image can be rotated and tilted for
view from various angles. In some embodiments, the 3-dimensional head and face are
transparent. Accordingly, the student can observe characteristics of facial and mouth 
movements and the placement of the tongue, lips and teeth. For instance, the student can 
observe the animated image from any angle. Normal or transparent modes allow the student to
observe teeth and tongue placement associated with the pronunciation of text being converted 
to speech by the system. Volume, speed and vocal characteristics associated with the animated 
image can be changed by the student using a computer interface (page 10, line 31 - page 11,
line 16). The systems and methods also provide a user interface for receiving utterances
spoken by the user in response to a prompt to replicate the audible speech provided by the text 
to speech conversion. The received utterances are analyzed. For example, the utterances are 
compared to records of the audible speech or are processed and compared to the output of 
models such as predictive models, phoneme models, diphone or dynamically generated models 
(page 12, line 31 - page 13, line 13). Feedback is then provided to the student based on the 
comparison. For example, a confidence measure which may be correlated to customized scoring
tables is provided to the student. Preferably, the feedback reflects the precision with which the 
user replicates the audible speech (page 13, lines 13-18). 

The Cited References 

All of the rejected claims stand rejected in view of Farrell taken alone or in
combination with other references. However, it is respectfully submitted that these rejections
include errors in fact with regard to assertions about what Farrell discloses. For instance,
claims 23 and 24 were rejected under 35 U.S.C. §102(a) as being anticipated by Farrell.

23. A system for interactive language instruction comprising: 

means for converting input text to audible speech in a selected 
language, the audible speech being patterned after a model; 

means for receiving utterances spoken by a user in response to a prompt 
to replicate the audible speech; 

means for recognizing the utterances and providing feedback to the user 
on each sub-word or phoneme portion of the utterances, the feedback being 
comprised of a confidence measure reflecting a precision at which the user 
replicates the audible speech in the selected language based on the comparison 
of the utterances to one of the audible speech and the model, wherein the 
confidence measure is provided as scores for replication of paragraphs, 
sentences, words, sub-words or phonemes. 



LUT2 2 00030-1 
Case Name/No. August 25-1-7-8-1-2-2-4 

However, it is respectfully submitted that Farrell does not disclose or suggest a means for 
converting input text to audible speech in a selected language. Instead, Farrell discloses 
presenting pre-packaged text and/or images to a user (column 7, lines 34-40).

For example, Farrell explains that, in operation, processor 60 of Farrell includes 
appropriate control logic preferably in the form of software to present visual stimuli, such as text 
and graphics, on display 64. Oral presentation of vocabulary elements utilizes speech 
synthesizer 74 and speaker 76. The user may respond using one or more input devices, such 
as keyboard 66, mouse 78, tablet 62 or microphone 72 (column 7, lines 34-40). However, it is 
respectfully submitted that Farrell does not disclose or suggest inputting text to be converted or 
inputting repurposed text from, for example, arbitrary sources such as news items or web pages.
Instead, it is respectfully submitted that, as explained in the portions cited above, Farrell
produces stimuli, such as text and graphics, from a prepackaged library of such elements and
waits for a user's response. Therefore, it is respectfully submitted that Farrell does not disclose
or suggest the means for converting input text to audible speech disclosed in the present
application and recited in claim 23.

Even if the playback of prerecorded speech of Farrell is construed as disclosing a means 
for converting input text to audible speech, it is respectfully submitted that Farrell does not 
disclose or suggest the means for converting input text to audible speech disclosed and claimed 
in the present application. It is respectfully submitted that an element in a claim for a 
combination may be expressed as a means or step for performing a specified function without 
the recital of structure, material or acts in support thereof and such claim shall be construed to 
cover the corresponding structure, material or acts described in the specification and
equivalents thereof (35 U.S.C. §112, sixth paragraph). 

For at least the foregoing reasons, claim 23, as well as claim 24, which depends 
therefrom, is not anticipated and is not obvious in light of Farrell. 

Claims 1, 5-7, 9, 11-14 and 20 were rejected under 35 U.S.C. §103(a) as being
unpatentable over Farrell in view of Mostow. 

1. A system for interactive language instruction comprising: 

a first module configured to receive repurposed input text from a 
repurposed source and convert the input text to audible speech in a
selected language, the audible speech being patterned after a model;

a user interface configured to receive utterances spoken by a user in 
response to a prompt to replicate the audible speech; and, 

a second module configured to recognize the utterances and provide 
feedback to the user, the feedback being comprised of a confidence measure 
reflecting a precision at which the user replicates the audible speech in the 
selected language based on a comparison of the utterances to one of the 
audible speech and the model, wherein the confidence measure is provided as 
scores for replication of at least one of paragraphs, sentences, words and sub- 
words. 

Arguments similar to those submitted in support of claim 23 are submitted in support of 
claim 1. It is respectfully submitted that Farrell is not fairly construed as disclosing a first module
configured to convert input text to audible speech since Farrell does not disclose or suggest 
inputting text. Instead, it is respectfully submitted that Farrell discloses presenting pre-packaged 
text and/or images to a user (column 7, lines 34-40).

Furthermore, even if the pre-loading of planned lesson vocabulary of Farrell is read as 
inputting text, the Office Action stipulates that Farrell does not disclose that input text is 
repurposed input text from a repurposed source. The Office Action relies on Mostow (e.g.,
column 8, lines 51-61) for disclosure of repurposed input text.

However, it is respectfully submitted that disclosure of importing text from a tutor from 
another domain (column 8, lines 51-61) relied on by the Office Action, or assertions that the 
invention of Mostow enables content to be created by operating the tutor in an authoring mode 
or during normal tutorial activities, does not disclose or suggest receiving repurposed input (i.e.,
arbitrary) text from a repurposed (i.e., arbitrary) source. It is respectfully submitted that the
purpose of the tutor in the other domain and the text received therefrom is the same as the 
purpose of the tutor in the present domain and said text once received in the present domain by 
the present tutor. That is, in both instances, the purpose of the text is language training. 
Therefore, text received from a tutor in another domain is not fairly read as being repurposed.
For a discussion of repurposed text, the attention of the Examiner is directed to, for example, 
page 9, line 29 - page 10, line 2, of the present application.

An advantage of using repurposed text over prepackaged text is that with repurposed 
text, lessons can be easily tailored to the needs of the student. For example, a student 
interested in working in automobile repair can select articles and web pages about cars while a 
student interested in farming can enter text from agricultural web pages and news reports. This 
allows the student to concentrate his or her studies on the most important portions of the new 
language. 

For at least the foregoing reasons, claim 1, as well as claims 5-7, 9 and 11-14, which 
depend therefrom, is not anticipated and is not obvious in light of Farrell and Mostow taken alone 
or in any combination. 

Additionally, in what is understood to be a reference to claim 5, the Office Action asserts 
that Farrell discloses that a vocabulary element may be a phoneme, word, sentence or paragraph
and directs the attention of the Applicants to column 4, lines 45-49, in support of the assertion.
The Office Action then asserts that a vocabulary element is "a predictive model." This assertion
is respectfully traversed.

Farrell and Mostow do not disclose or suggest that a vocabulary element, such as a 
phoneme, word, sentence or paragraph, is properly characterized as a predictive model. For a
discussion of predictive models, attention of the Review Panel is directed, for example, to page
3, lines 16-24; page 3, line 30 - page 4, line 7; and page 13, lines 1-4, of the present application.
It is respectfully submitted that these portions of the specification make it clear that a predictive 
model is something other than a vocabulary element. For example, on page 13, predictive 
models are identified as one of a variety of models including phoneme models, diphone models 
or dynamically generated models. Therefore, it is respectfully submitted that it is clear that 
predictive models are something other than phoneme models. Moreover, they are something 
different than phonemes. 

For at least the foregoing reasons, it is respectfully submitted that Farrell and Mostow do
not disclose or suggest a system for interactive language instruction including a second module 
configured to recognize the utterances and provide feedback to the user, the feedback being 
comprised of a confidence measure reflecting a precision at which the user replicates the 
audible speech in the selected language based on a comparison of the utterances to one of the 
audible speech and a model, wherein the model is one of a predictive model, a diphone model 
and a dynamically generated model, as recited in claim 5.

For at least the foregoing additional reasons, it is respectfully submitted that claim 5 is 
not anticipated and is not obvious in light of Farrell and Mostow taken alone or in any 
combination. 

Claims 2-4 were rejected under 35 U.S.C. §103(a) as being unpatentable over Farrell in
view of Mostow and further in view of Henton. 

2. The system as set forth in claim 1 further comprising a third module 
synchronized to the first module for producing a visual pronunciation aid in the
form of an animated image of a human face and head pronouncing the audible 
speech. 

In explaining the rejection of claim 2, the Office Action stipulates that Farrell (and 
Mostow?) omit disclosing a third module synchronized to the first module for producing a visual 
pronunciation aid in the form of an animated image of a human face and head pronouncing the 
audible speech and relies on Henton for such disclosure. 

However, while Henton discloses an animated face in the context of animated "intelligent" 
assistants to instruct a user or to tell the user about some event (column 3, lines 33-37), Henton 
does not disclose or suggest that an animated face could or should be used as a visual 
pronunciation aid to aid a language student in learning how to pronounce vocabulary elements. It
is respectfully submitted there is simply no motivation in the art to combine the animated face of 
Henton with the language training system of Farrell other than that found in the present 
application. It is respectfully submitted that the motivation suggested by the Office Action is 
based on impermissible hindsight.

For at least the foregoing reasons, claims 2-4 are not anticipated and are not obvious in 
light of Farrell, Mostow and Henton taken alone or in any combination. 

Additionally, claim 3 recites the animated image of the human face and head portrays a 
transparent face and head. In this regard, the Office Action asserts that Henton discloses a 
face and head, which is a "transparent" line drawing of a human face and head. However, it is 
respectfully submitted that the line drawings of FIG. 3 are simply that: line drawings. They do not
depict or suggest a transparent head, and they do not, for example, better enable an observer to
determine relative positions of lips, teeth and tongue while pronouncing vocabulary elements.
Indeed, Henton does not even disclose or suggest that the line drawings be used as an 
animation. Instead, the line drawings are simply a tool used for the purpose of the patent 
document to emphasize the most salient features of the ten visemes of FIG. 3 (column 5, lines 3- 
5). 

For at least the foregoing additional reasons, claim 3 is not anticipated and is not 
obvious in view of Farrell, Mostow and Henton taken alone or in any combination.

Regarding claim 4, the Office Action asserts that Henton discloses a voice table block is 
utilized by voice synthesizer to provide all needed phones or use aliases for any needed missing 
phones and directs the attention of the Applicants to column 5, lines 42-52, and FIG. 2 in support 
of this assertion. The Office Action then asserts that supplying phones for speech synthesis 
involves controlling one of "the vocal characteristics of the audible speech." However, even if 
the assertions of the Office Action are correct, it is respectfully submitted that one of ordinary skill 
in the art would understand that claim 4 refers to user controls. For example, page 10 of the
Detailed Action mailed March 17, 2004, asserts that concerning claim 4, Farrell must implicitly 
include at least a volume control for speakers 76. It is respectfully submitted that Henton and the 
assertions of the Office Action do not disclose or suggest providing the user with a means for 
adjusting the vocal characteristics of audible speech. 

For at least the foregoing additional reasons, claim 4 is not anticipated and is not 
obvious in light of Farrell, Mostow and Henton taken alone or in any combination. 

Claim 8 was rejected under 35 U.S.C. §103(a) as being unpatentable over Farrell in view
of Mostow and further in view of Doi. 

However, claim 8 recites the system as set forth in claim 1 wherein the system further 
comprises a mapping of sub-words in a first language to sub-words in a second language for 
illustrating sound-alike comparisons to a student. In explaining the rejection of claim 8, the
Office Action directs the attention of the Applicants to column 7, lines 13-66, and column 2, lines 
14-49, of Doi. However, column 7, lines 13-36, describes the function and purpose of a data 
display control for selecting translation possibilities. It is respectfully submitted that Doi does 
not disclose or suggest sound-alike comparisons. Instead, the subject matter of FIG. 4 
and the referenced portion of column 7 are related to complete word literal translations
and are unconcerned with teaching the proper sound or pronunciation of words.
Furthermore, FIG. 4 and the referenced portion of column 7 are unrelated to any discussion of
sub-words. It is respectfully submitted that similar remarks are applicable to the referenced
portion of column 2. It is respectfully submitted that Doi is concerned with presenting a plurality
of possible literal translations to a user. However, Doi does not disclose or suggest the subject
matter for which it is relied upon. That is, Doi does not disclose or suggest a mapping of
sub-words in a first language to sub-words in a second language for illustrating sound-alike
comparisons to a student as recited in claim 8 and discussed in the present application,
for example, on page 5, lines 7-13, and on page 19, lines 10-18.

Furthermore, there is no motivation to combine the teaching of Doi with that of Farrell and 
Mostow. Farrell and Mostow do not disclose or suggest performing translation services. 

For at least the foregoing reasons, claim 8 is not anticipated and is not obvious in light of 
Farrell, Mostow and Doi taken alone or in any combination. 

For at least the foregoing reasons, Pre-Appeal Brief Review is requested. 

Respectfully submitted,



FAY, SHARPE, FAGAN, 
MINNICH & McKEE, LLP 